About the LeetCode Notes

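This is a record of the LeetCode problems I have worked through. Many of the notes were written a while ago, so this doubles as review. I paste in the code and write up some analysis. Some problems that were already accepted (AC) I re-implemented in other languages, which basically amounted to redoing them. I set every date to 2014-03-11, so all of these notes sort before the first post, hiahiahia.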

At first I wrote a Markdown template, a file like this (gist is blocked in mainland China):

title: "Problem title goes here"
date: 2014-03-11 00:33:34
tags: [algorithms, leetcode]
---

### Description
---
Description goes here
<!--more-->

### Analysis
---
Analysis goes here

### Solution 1 (C++)
---
### Solution 2 (Java)
---
### Solution 3 (Python)
---

### Related Problems
---

### [Problem Source]()

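Then I copied and pasted it problem by problem. After a few problems I got fed up, so I wrote a Python script to do it for me. It crawls the LeetCode problem list and creates the .md files automatically. It does not scrape the problem descriptions: descriptions get updated, some are in HTML, and some include images, so converting them to Markdown is a bit of a hassle. Since I have to write the analysis and paste in the code for every problem anyway, one more copy-paste hardly matters. The tags and similar problems, on the other hand, are scraped along the way, because copying those by hand every time is too tedious.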

Here is the Python code (written for Python 2):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import urllib2
import re
from bs4 import BeautifulSoup

import sys
reload(sys)
sys.setdefaultencoding( "utf-8" )

leetcode_md = u"""title: "%s"
date: 2014-03-11 00:33:34
tags: [algorithms, leetcode, %s]
---

### Description
---
Description goes here
<!--more-->

### Analysis
---
Analysis goes here

### Solution 1 (C++)
---
### Solution 2 (Java)
---
### Solution 3 (Python)
---

### Related Problems
---
%s
### [Problem Source](%s)
"""

def get_tag_content(tag):
    u"""
    Join the contents of a BeautifulSoup tag into a single string
    """
    return "".join([unicode(x) for x in tag.contents])

def get_attr(dom, attr, defaultValue=""):
    u"""
    Get the given attribute of a BeautifulSoup tag.
    Return the default value if the tag is None or lacks the attribute.
    """
    if dom is None:
        return defaultValue
    return dom.get(attr, defaultValue)

leetcode_problems = 'https://leetcode.com/problemset/algorithms/'

html = urllib2.urlopen(leetcode_problems)
content = html.read()
soup = BeautifulSoup(content, 'lxml')

problem_list = soup.select('table.table tbody tr')
# print problem_list

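# Only the first 5 rows are processed here; drop the [:5] slice to generate a file for every problem
for item in problem_list[:5]: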
    soup = BeautifulSoup(str(item), 'lxml')

    problem_id = get_tag_content(soup.select('td')[1])
    problem_name = get_tag_content(soup.select('td a')[0]).replace(' ', '-')
    href = get_attr(soup.select('td a')[0], 'href')
    problem_href = 'https://leetcode.com' + href

    filename = 'leetcode-' + str(problem_id) + '-' + str(problem_name) + ".md"
    problem_name_md = 'leetcode-' + str(problem_id) + '-' + str(problem_name)
    html = urllib2.urlopen(problem_href)
    content = html.read()

    soup = BeautifulSoup(content, 'lxml')

    problem_tag_list = []
    similar_problem_list = []
    if len(soup.select('span.hidebutton')) > 0:
        problem_tag_list = soup.select('span.hidebutton')[0].select('a')
    if len(soup.select('span.hidebutton')) > 1:
        similar_problem_list = soup.select('span.hidebutton')[1].select('a')
    tags = []
    for tag_item in problem_tag_list:
        soup = BeautifulSoup(str(tag_item), 'lxml')
        tag = get_tag_content(soup.select('a')[0]).strip().replace(' ', '-')
        tags.append(tag)

    similar_problem = {}

    for similar_item in similar_problem_list:
        soup = BeautifulSoup(str(similar_item), 'lxml')
        similar_problem_name = get_tag_content(soup.select('a')[0]).strip()
        href = get_attr(soup.select('a')[0], 'href')
        similar_problem_href = 'https://leetcode.com' + href
        similar_problem[similar_problem_name] = similar_problem_href

    title = problem_name_md
    md_tags = ', '.join(tags)
    similar_problem_md = ''
    for key, value in similar_problem.items():
        similar_problem_md += ('['+key+']'+'('+value+')   \n')     # the trailing spaces force a line break in Markdown
    now_leetcode_md = leetcode_md % (title, md_tags, similar_problem_md, problem_href)

    print(u"Done: " + filename)
    with open(filename, 'w') as f:
        f.write(now_leetcode_md)

With that, all the Markdown files get generated automatically (ahahaha, I really am a genius):
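The script above targets Python 2 and needs network access, so here is a rough Python 3 sketch of just the template-filling step, which can be tried offline. The names `NOTE_TEMPLATE`, `note_title`, and `render_note` are made up for this illustration and are not part of the original script. The template is abbreviated, and the sample problem data is only an example. Unlike the original, this sketch builds the whole tag list before joining it, so a problem with no scraped tags does not leave a dangling comma.

```python
# Python 3 sketch of the template-filling step only (no scraping, no file I/O).
# NOTE_TEMPLATE, note_title and render_note are hypothetical names for illustration.

NOTE_TEMPLATE = """title: "{title}"
date: 2014-03-11 00:33:34
tags: [{tags}]
---

### Related Problems
---
{similar}
### [Problem Source]({href})
"""


def note_title(problem_id, problem_name):
    """Build the 'leetcode-<id>-<Name-With-Dashes>' title used for the note."""
    return "leetcode-{}-{}".format(problem_id, problem_name.strip().replace(" ", "-"))


def render_note(problem_id, problem_name, tags, similar, href):
    """Return (filename, markdown_text) for one problem.

    tags    -- list of tag names taken from the problem page
    similar -- dict mapping similar-problem names to their URLs
    href    -- URL of the problem itself
    """
    title = note_title(problem_id, problem_name)
    all_tags = ["algorithms", "leetcode"] + [t.strip().replace(" ", "-") for t in tags]
    # Trailing spaces before the newline force a line break in Markdown.
    similar_md = "".join("[{}]({})   \n".format(name, url) for name, url in similar.items())
    text = NOTE_TEMPLATE.format(
        title=title, tags=", ".join(all_tags), similar=similar_md, href=href
    )
    return title + ".md", text


if __name__ == "__main__":
    filename, text = render_note(
        1,
        "Two Sum",
        ["Array", "Hash Table"],
        {"3Sum": "https://leetcode.com/problems/3sum/"},
        "https://leetcode.com/problems/two-sum/",
    )
    print(filename)
    print(text)
```

Writing the result to disk is then just a matter of opening `filename` with `encoding="utf-8"` and writing `text`.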

「Python will save the world. I don’t know how, but it will」

P.S. The code in the post is generally the latest version; the git log is here

Update: 2016-10-23, started working on the database problems on LeetCode.